Lecture Translator - Speech translation framework for simultaneous lecture translation
Authors
Abstract
Foreign students at German universities frequently have difficulty following lectures, as these are often held in German. Since human interpreters are too expensive for universities, we address this problem with speech translation technology deployed in KIT’s lecture halls. Our simultaneous lecture translation system automatically translates lectures from German to English in real time. Other supported language directions are English to Spanish, English to French, English to German, and German to French. Automatic simultaneous translation is more than just the concatenation of automatic speech recognition and machine translation technology, as the input is an unsegmented, practically infinite stream of spontaneous speech. The lack of segmentation and the spontaneous nature of the speech make it especially difficult to recognize and translate with sufficient quality. In addition to quality, speed and latency are of the utmost importance if the system is to enable students to follow lectures. In this paper we present our system, which performs simultaneous speech translation of university lectures on a stream of audio in real time and with low latency. The system features several techniques beyond the basic speech translation task that make it fit for real-world use. Examples of these features are continuous-stream speech recognition without any prior segmentation of the input audio, punctuation prediction, and run-on decoding and run-on translation with continuously updating displays that keep the latency as low as possible.
Similar resources
Unsupervised Acoustic Model Training for Simultaneous Lecture Translation in Incremental and Batch Mode
In this work the theoretical concepts of unsupervised acoustic model training and the application and evaluation of unsupervised training schemes are described. Experiments aiming at speaker adaptation via unsupervised training are conducted on the KIT lecture translator system. Evaluation takes place with respect to training efficiency and overall system performance in dependency of the availabl...
Construction of Chunk-Aligned Bilingual Lecture Corpus for Simultaneous Machine Translation
With the development of speech and language processing, speech translation systems have been developed. These studies target spoken dialogues and employ consecutive interpretation, which uses a sentence as the translation unit. On the other hand, there exist a few studies on simultaneous interpreting, and recently, the language resources for promoting simultaneous interpreting r...
Unsupervised vocabulary selection for simultaneous lecture translation
In this work, we propose a novel method for vocabulary selection which enables simultaneous speech recognition systems for lectures to automatically adapt to the diverse topics that occur in educational and scientific lectures. Utilizing materials that are available before the lecture begins, such as lecture slides, our proposed framework iteratively searches for related documents on the World ...
Unsupervised Vocabulary Selection for Domain-Independent Simultaneous Lecture Translation
In this work, we investigate methods to automatically adapt our simultaneous lecture translation systems to the diverse topics that occur in educational lectures. Utilizing materials that are available before the lecture begins, such as lecture slides, our proposed framework iteratively searches for related documents on the World Wide Web and generates lecture-specific models and vocabularies b...
Open Domain Speech Translation: From Seminars and Speeches to Lectures
This paper describes our ongoing work in open domain speech translation. We describe how we developed a lecture translation system by moving from speech translation of European Parliament Plenary Sessions and seminar talks to the open domain of lectures. We started with our speech recognition and statistical machine translation 2006 evaluation systems developed within the framework of TC-Star (...